Website Tech Stack Detector — Technographics by Domain
Deprecated
Pricing
from $8.50 / 1,000 domains profiled
Detect the technologies a website runs: CMS, ecommerce platform, analytics, tag managers, JS frameworks, CDN, payment, and marketing tools. Give a list of company domains; get a normalized JSON tech profile per site for B2B sales targeting and competitive technographic research.
Developer
Scott Helvick
Maintained by Community
Find out what technology any website runs. Give this Actor a list of company domains and it returns a normalized JSON technology profile per site — the CMS, ecommerce platform, analytics and tag managers, JavaScript frameworks, CDN, payment providers, marketing and support tools, and more — each with the evidence that identified it. Built for B2B sales targeting, competitive research, and technographic datasets, and callable directly by an AI agent.
What this does
- Takes a list of domains or URLs (1–100 per run) and profiles each site's homepage.
- Detects technologies across ~20 categories: CMS (WordPress, Drupal, Webflow, Wix, …), ecommerce platform (Shopify, WooCommerce, Magento, BigCommerce, …), analytics (Google Analytics, Hotjar, Segment, Mixpanel, …), tag managers, JavaScript frameworks (React, Next.js, Vue, Angular, …), CDN / hosting, web server / language, marketing automation (HubSpot, Marketo, Klaviyo, …), customer support / chat, advertising pixels, A/B testing, payment (Stripe, PayPal, Braintree, …), cookie consent, search, and video.
- Returns each detected technology with a category, a confidence (high for a specific signature like a script URL or generator tag; medium for a generic HTML pattern), and the concrete evidence that matched.
- Groups results by category and gives a per-site technology count for fast scanning.
- Profiles are deterministic — the same page yields the same result every time. No model guesses what a site "probably" runs.
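The per-category grouping and per-site count can be reproduced from the flat technology list. The sketch below is illustrative only: `summarize` is a hypothetical helper, not the Actor's own code, and the sample entries are made up in the shape of the output record.

```python
from collections import defaultdict


def summarize(technologies):
    """Group detected technologies by category and count them (hypothetical helper)."""
    categories = defaultdict(list)
    for tech in technologies:
        categories[tech["category"]].append(tech["name"])
    return {"categories": dict(categories), "technologyCount": len(technologies)}


# Made-up sample in the shape of the "technologies" field.
detected = [
    {"name": "Shopify", "category": "Ecommerce", "confidence": "high"},
    {"name": "Google Analytics", "category": "Analytics", "confidence": "high"},
    {"name": "Hotjar", "category": "Analytics", "confidence": "high"},
]
summary = summarize(detected)
print(summary)
```

Because the summary is a pure function of the technology list, the same list always produces the same grouping.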
Use it to:
- Build B2B prospect lists filtered by technology (e.g. "every site running Shopify and Klaviyo").
- Score sales leads by the tools they already use.
- Run competitive research on what a set of competitors' sites are built with.
- Track migrations — re-run a domain list over time to see platform changes.
- Feed an AI sales-research agent structured technographic data without maintaining your own signature database.
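As a rough illustration of the first use case, a prospect filter over the dataset records might look like the following. `sites_running` is a hypothetical helper and the `.example` records are invented; only the field names (`status`, `domain`, `technologies`, `name`) come from the output format.

```python
def sites_running(records, required):
    """Return domains whose completed profile contains every required technology."""
    wanted = {name.lower() for name in required}
    matches = []
    for record in records:
        if record.get("status") != "completed":
            continue  # failed records carry no usable profile
        found = {t["name"].lower() for t in record.get("technologies") or []}
        if wanted <= found:
            matches.append(record["domain"])
    return matches


# Invented records for illustration.
records = [
    {"domain": "shop-a.example", "status": "completed",
     "technologies": [{"name": "Shopify"}, {"name": "Klaviyo"}]},
    {"domain": "shop-b.example", "status": "completed",
     "technologies": [{"name": "Shopify"}]},
    {"domain": None, "status": "failed", "technologies": []},
]
prospects = sites_running(records, ["Shopify", "Klaviyo"])
print(prospects)
```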
Why deterministic, signature-based detection matters
Technographic data drives outreach and spend decisions, so a wrong answer is worse than no answer. Every technology this Actor reports is a verbatim signature match against the page's own markup — a script URL, a generator meta tag, a framework marker, a response header — and the matching evidence travels with each result so you can audit it. There is no language model in the detection path inventing a plausible-but-wrong stack. A useful side effect: because the work is fetch-plus-pattern-match with no inference bill, the cost floor is tiny, so the price reflects the lookup, not a model call.
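To make the idea concrete, here is a minimal sketch of signature-based detection. The two patterns in `SIGNATURES` are hypothetical examples written for this illustration; they are not the Actor's actual signature set, which is not reproduced here.

```python
import re

# Hypothetical signatures: (name, category, confidence, compiled pattern).
SIGNATURES = [
    ("Shopify", "Ecommerce", "high",
     re.compile(r"""https://cdn\.shopify\.com/[^"'\s>]*""")),
    ("WordPress", "CMS", "high",
     re.compile(r"""<meta[^>]+name=["']generator["'][^>]+content=["']WordPress[^"']*["']""", re.I)),
]


def detect(html):
    """Return one result per matching signature, with the matched text as evidence."""
    results = []
    for name, category, confidence, pattern in SIGNATURES:
        match = pattern.search(html)
        if match:
            results.append({
                "name": name,
                "category": category,
                "confidence": confidence,
                "evidence": match.group(0),
            })
    return results


page = (
    '<html><head><meta name="generator" content="WordPress 6.5"></head>'
    '<body><script src="https://cdn.shopify.com/s/files/app.js"></script></body></html>'
)
results = detect(page)
for result in results:
    print(result["name"], "<-", result["evidence"])
```

Each result carries the literal substring that matched, which is what makes a detection auditable, and running `detect` twice on the same markup yields identical output.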
How it compares to the alternatives
| Approach | Normalized categories | Evidence per match | Bulk by domain | Agent-callable |
|---|---|---|---|---|
| Subscription technology-lookup services | yes | rarely | yes | via their own API/plan |
| Roll-your-own page parsing | you build it | you build it | you build it | you build it |
| Website Tech Stack Detector | yes | yes | yes (1–100/run) | yes |
The honest framing: you can parse pages and maintain a signature set yourself — this Actor is for when you'd rather not own that, and want a stable JSON contract you can point an agent or a pipeline at. Subscription technology databases are the alternative when you need a multi-year history or a firmographic overlay and don't mind a per-seat plan; this Actor is the better fit for on-demand, pay-per-domain lookups wired into your own workflow.
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| domains | array of strings | yes | - | Company domains or full URLs to profile (1–100). Accepts example.com, www.example.com, or https://example.com/path; each is normalized to an https homepage fetch. |
| deepRender | boolean | no | false | Force a full JavaScript render on every domain to catch technologies injected by client-side scripts. Off by default: raw HTML already contains the loader tags for the large majority of technologies, and the fetch escalates to a render automatically when a site blocks the plain request. |
One dataset record is produced per input domain.
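The normalization described for `domains` could be approximated as below. This is a sketch under assumptions: `normalize_to_homepage` is a hypothetical name, and details such as keeping the `www.` prefix or rejecting hosts without a dot are guesses, not documented Actor behavior.

```python
from urllib.parse import urlsplit


def normalize_to_homepage(value):
    """Return an https homepage URL for a domain or URL, or None if no hostname is usable."""
    text = value.strip()
    if not text:
        return None
    # urlsplit only recognizes the host when a scheme separator is present.
    if "://" not in text:
        text = "https://" + text
    try:
        host = urlsplit(text).hostname
    except ValueError:
        return None
    # Assumption: a usable hostname contains at least one dot.
    if not host or "." not in host:
        return None
    return f"https://{host}/"


for raw in ["example.com", "www.example.com", "https://example.com/path"]:
    print(raw, "->", normalize_to_homepage(raw))
```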
Output
One record per domain. Nullable fields are null on a failed record.

    {
      "identifier": "shopify.com",
      "status": "completed",
      "url": "https://www.shopify.com/",
      "domain": "shopify.com",
      "technologies": [
        {
          "name": "Shopify",
          "category": "Ecommerce",
          "confidence": "high",
          "evidence": "URL: https://cdn.shopify.com/s/..."
        },
        {
          "name": "Google Analytics",
          "category": "Analytics",
          "confidence": "high",
          "evidence": "URL: https://www.googletagmanager.com/gtag/js?id=G-..."
        }
      ],
      "categories": {
        "Ecommerce": ["Shopify"],
        "Analytics": ["Google Analytics"]
      },
      "technologyCount": 2,
      "realizedTier": "basic",
      "error": null,
      "notice": "Technologies are inferred from publicly served page markup via signature matching; detection is best-effort and provided as-is, not a guarantee of what a site runs."
    }

A failed record carries status: "failed", an error tag (e.g.
fetch-failed, empty-response, invalid-domain), and an empty
technologies list.
Example
Profile three companies' sites:

    {"domains": ["shopify.com", "stripe.com", "wordpress.org"]}

With curl:

    curl -X POST "https://api.apify.com/v2/acts/shelvick~website-tech-stack-detector/run-sync-get-dataset-items?token=YOUR_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"domains":["shopify.com","stripe.com","wordpress.org"]}'

With the Python client:

    from apify_client import ApifyClient

    client = ApifyClient("YOUR_TOKEN")
    run = client.actor("shelvick/website-tech-stack-detector").call(
        run_input={"domains": ["shopify.com", "stripe.com", "wordpress.org"]}
    )
    # Print one summary line per domain record.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item["identifier"], item["technologyCount"], list((item.get("categories") or {}).keys()))

Calling from an AI agent
- Apify MCP server (mcp.apify.com): the Actor is exposed as a callable tool whose input schema is self-documenting, so an LLM can construct a valid call from the tool description alone (domains in, technology profiles out). Pay per call via x402 (USDC on Base) or Skyfire managed tokens.
- Apify SDK (Python): from apify_client import ApifyClient, then client.actor("shelvick/website-tech-stack-detector").call(run_input=...) and iterate the dataset (see above).
- REST API: POST /v2/acts/shelvick~website-tech-stack-detector/run-sync-get-dataset-items?token=... for synchronous runs; use the async /runs endpoint for large domain lists that may exceed the 5-minute sync window.
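For callers who prefer not to install a client library, the two REST endpoints can be addressed with the standard library alone. The sketch below only builds the request and does not send it; `build_run_request` and `API_BASE` are hypothetical names introduced for this example, and the token is a placeholder.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2/acts/shelvick~website-tech-stack-detector"


def build_run_request(token, domains, sync=True):
    """Build (but do not send) the POST request for a synchronous or asynchronous run."""
    endpoint = "run-sync-get-dataset-items" if sync else "runs"
    body = json.dumps({"domains": domains}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}?token={token}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("YOUR_TOKEN", ["shopify.com"], sync=False)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. With the async endpoint the response describes the started run rather than its results, so fetching the dataset after the run finishes is a separate step that is left out here.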
Pricing
Pay-per-event, billed only on success: one charge per domain that is fetched and analyzed, after its record is pushed to the dataset. Domains that fail to fetch — or that are too heavily bot-walled to reach — are free. Because billing is per domain, your domain-list length is your spend cap.
See the Pricing tab on this Store page for the current per-domain rate and any active subscriber discounts.
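A worked example of the spend cap, assuming the listed "from $8.50 / 1,000" price applies (the live rate on the Pricing tab may differ). `spend_cap` and `actual_charge` are hypothetical helpers for illustration.

```python
from decimal import Decimal

# Assumed rate, taken from the listed "from" price; check the Pricing tab for the real one.
RATE_PER_DOMAIN = Decimal("8.50") / Decimal(1000)


def spend_cap(domain_count):
    """Upper bound on cost: every submitted domain succeeds."""
    return RATE_PER_DOMAIN * domain_count


def actual_charge(records):
    """Only records that completed are billable; failed ones are free."""
    billable = sum(1 for r in records if r.get("status") == "completed")
    return RATE_PER_DOMAIN * billable


sample = [{"status": "completed"}, {"status": "failed"}, {"status": "completed"}]
print("cap for 100 domains:", spend_cap(100))
print("charge for sample run:", actual_charge(sample))
```

At the assumed rate, a full 100-domain run costs at most $0.85, and any failed domains reduce that.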
Design notes: www.scotthelvick.com/tools/website-tech-stack-detector
Behavior
Run-level failures (rare) — input validation only: an empty domains list
or more than 100 entries is rejected before any work.
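Since a run rejects an empty list or more than 100 entries, a caller with a longer list needs to split it first. One possible approach, with `batch_domains` as a hypothetical client-side helper:

```python
MAX_DOMAINS_PER_RUN = 100  # limit stated for the domains input field


def batch_domains(domains, size=MAX_DOMAINS_PER_RUN):
    """Drop blank entries and split the rest into run-sized batches."""
    cleaned = [d.strip() for d in domains if d and d.strip()]
    if not cleaned:
        raise ValueError("domains must contain at least one entry")
    return [cleaned[i:i + size] for i in range(0, len(cleaned), size)]


batches = batch_domains([f"site{i}.example" for i in range(250)])
print([len(batch) for batch in batches])
```

Each batch can then be submitted as its own run.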
Per-domain outcomes (common) — each domain yields a record; failures are isolated and never charged:
- invalid-domain: the input had no usable hostname.
- fetch-failed: the site could not be reached (or is bot-walled beyond the rendered-fetch tier).
- challenge-blocked: the fetch returned a bot-wall / CAPTCHA challenge page rather than the real site (only CDN/security markers were present), so it is reported as blocked instead of a misleading thin profile.
- empty-response: the fetch returned no usable HTML.
A domain that is reached successfully but matches no known signature returns
status: "completed" with an empty technology list — a valid answer (the site
uses none of the detected technologies), and it is charged like any successful
profile.
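Downstream code usually needs to tell these three outcomes apart: a failure, a valid empty profile, and a populated profile. A minimal sketch, where `triage` and its return labels are hypothetical and only the field names come from the output format:

```python
def triage(record):
    """Classify a dataset record as failed, no-match, or profiled."""
    if record.get("status") == "failed":
        return f"failed:{record.get('error')}"
    if not record.get("technologies"):
        return "no-match"  # reached successfully, nothing detected
    return "profiled"


# Invented records for illustration.
outcomes = [
    triage({"status": "failed", "error": "challenge-blocked", "technologies": []}),
    triage({"status": "completed", "error": None, "technologies": []}),
    triage({"status": "completed", "error": None,
            "technologies": [{"name": "Stripe", "category": "Payment"}]}),
]
print(outcomes)
```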
Performance — one homepage fetch per domain, raw HTML first with an automatic escalation to a rendered fetch when a site needs it. Domains are processed concurrently, so a small list finishes in seconds; a 100-domain list runs longer and may need the async endpoint rather than the 5-minute sync window.
FAQ
Which technologies can it detect?
Around 95 technologies across ~20 categories: CMS, ecommerce platforms, analytics, tag managers, JavaScript frameworks, CDN/hosting, web servers and languages, marketing automation, support/chat, ad pixels, A/B testing, payment, cookie consent, search, and video. The set favors the highest-signal, most common technologies and grows over time.
Why did a site I know uses tool X not show it?
Some technologies are injected only after client-side scripts run; enable
deepRender to force a full render. Others leave no detectable public signature,
and a few sites are bot-walled beyond the rendered-fetch tier (those return a
failed record and aren't charged).
What does "confidence" mean?
high = a specific signature matched (a third-party script URL, a generator
meta tag, a response header, or a cookie). medium = a generic HTML pattern
matched. Every result includes the exact evidence so you can verify it.
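If a workflow should act only on specific-signature matches, the `confidence` field can be used as a filter. The helper name `high_confidence_only` and the sample record are hypothetical.

```python
def high_confidence_only(record):
    """Keep only technologies whose confidence is 'high'."""
    return [t for t in record.get("technologies") or [] if t.get("confidence") == "high"]


# Invented record for illustration.
record = {
    "domain": "site.example",
    "technologies": [
        {"name": "Shopify", "confidence": "high", "evidence": "URL: https://cdn.shopify.com/s/..."},
        {"name": "React", "confidence": "medium", "evidence": "generic HTML pattern"},
    ],
}
strong = high_confidence_only(record)
for tech in strong:
    print(tech["name"], "|", tech["evidence"])
```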
Can I pass full URLs, not just domains?
Yes — bare domains, www. hostnames, and full URLs are all accepted; each is
normalized to an https homepage fetch.
Is this only the homepage?
Yes. One homepage fetch per domain covers site-wide technologies (most tags load on every page) and keeps the cost predictable. Per-page crawling is out of scope.
What this doesn't do
- No deep crawl. It profiles the homepage, not every page of a site.
- No technology history or version timelines. It reports the current state, not when a site adopted or dropped a tool.
- No firmographics or contact data. It returns the tech stack, not company size, revenue, or email addresses.
- No market-share rankings. It profiles the domains you give it; it doesn't tell you how popular a technology is overall.
- No authenticated or paywalled pages. Public homepage markup only.
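Because the Actor reports only the current state, any history has to be kept on the caller's side by re-running the same domain list and comparing results. A rough sketch of such a comparison, with `stack_changes` as a hypothetical helper and invented `.example` records:

```python
def stack_changes(previous, current):
    """Compare two runs keyed by domain; return added and removed technology names."""
    def names_by_domain(records):
        # Only completed records are compared; failed ones have no profile.
        return {
            r["domain"]: {t["name"] for t in r.get("technologies") or []}
            for r in records
            if r.get("status") == "completed"
        }

    before, after = names_by_domain(previous), names_by_domain(current)
    changes = {}
    for domain in before.keys() & after.keys():
        added = sorted(after[domain] - before[domain])
        removed = sorted(before[domain] - after[domain])
        if added or removed:
            changes[domain] = {"added": added, "removed": removed}
    return changes


# Invented records for two runs of the same list.
run_january = [{"domain": "shop.example", "status": "completed",
                "technologies": [{"name": "Magento"}, {"name": "Stripe"}]}]
run_june = [{"domain": "shop.example", "status": "completed",
             "technologies": [{"name": "Shopify"}, {"name": "Stripe"}]}]
changes = stack_changes(run_january, run_june)
print(changes)
```

A domain that failed in either run is skipped rather than reported as having dropped its whole stack.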
For an aggregated competitive landscape of local businesses (counts, ratings, saturation) use a local-market analysis Actor instead. For turning a list of URLs into arbitrary structured fields against your own schema, use a structured web-extraction Actor. For fetching the raw page content itself in multiple formats, use an adaptive page-fetching Actor.